Results 1 - 10 of 10
1.
Article in English | MEDLINE | ID: mdl-37037246

ABSTRACT

Recently, multiagent reinforcement learning (MARL) has shown great potential for learning cooperative policies in multiagent systems (MASs). However, a noticeable drawback of current MARL is its low sample efficiency, which requires a huge number of interactions with the environment. This number of interactions greatly hinders the real-world application of MARL. Fortunately, effectively incorporating experience knowledge can help MARL quickly find effective solutions, which can significantly alleviate this drawback. In this article, a novel multiexperience-assisted reinforcement learning (MEARL) method is proposed to improve the learning efficiency of MASs. Specifically, monotonicity-constrained reward shaping is designed using expert experience to provide additional individual rewards that guide multiagent learning efficiently while guaranteeing the invariance of the team optimization objective. Furthermore, a reward distribution estimator is developed to model the implicit reward distribution of the environment using transition experience from the environment, that is, collected samples (state-action pair, reward, and next state). This estimator can predict each agent's expected reward for the action taken, in order to accurately estimate the state value function and accelerate its convergence. The performance of MEARL is evaluated on two multiagent environment platforms: our designed unmanned aerial vehicle combat (UAV-C) and StarCraft II Micromanagement (SCII-M). Simulation results demonstrate that the proposed MEARL can greatly improve the learning efficiency and performance of MASs and is superior to the state-of-the-art methods in multiagent tasks.

2.
IEEE Trans Neural Netw Learn Syst ; 34(11): 8235-8249, 2023 Nov.
Article in English | MEDLINE | ID: mdl-35180087

ABSTRACT

In this article, a novel method, called attention enhanced reinforcement learning (AERL), is proposed to address issues including complex interaction, limited communication range, and time-varying communication topology in multiagent cooperation. AERL includes a communication enhanced network (CEN), a graph spatiotemporal long short-term memory network (GST-LSTM), and parameter-sharing multi-pseudo critic proximal policy optimization (PS-MPC-PPO). Specifically, CEN, based on a graph attention mechanism, is designed to enlarge the agents' communication range and to deal with complex interaction among the agents. GST-LSTM, which replaces the standard fully connected (FC) operator in LSTM with a graph attention operator, is designed to capture the temporal dependence while maintaining the spatial structure learned by CEN. PS-MPC-PPO, which extends proximal policy optimization (PPO) to multiagent systems with parameter sharing so that training scales to environments with a large number of agents, is designed with multi-pseudo critics to mitigate the bias problem in training and accelerate convergence. Simulation results for three groups of representative scenarios, including formation control, group containment, and predator-prey games, demonstrate the effectiveness and robustness of AERL.

3.
J Sports Sci ; 40(14): 1629-1640, 2022 Jul.
Article in English | MEDLINE | ID: mdl-35793267

ABSTRACT

Characterizing playing style is important for football clubs in scouting, monitoring and match preparation. Previous studies considered a player's style as a combination of technical performances, failing to consider spatial information. Therefore, this study aimed to characterize the playing styles of each playing position in Chinese Football Super League (CSL) matches, integrating the recently adopted Player Vectors framework. Data from 960 matches in the 2016-2019 CSL seasons were used. Match ratings and 10 types of match events, with the corresponding coordinates, were extracted for all line-up players whose on-pitch time exceeded 45 minutes. Players were first clustered into eight positions. A player vector was constructed for each player in each match based on the Player Vectors framework using Nonnegative Matrix Factorization (NMF). Another NMF process was run on the player vectors to extract different types of playing styles. The resulting player vectors revealed 18 different playing styles in the CSL. Six performance indicators of each style were investigated to observe their contributions. In general, the playing styles of forwards and midfielders are in line with football performance evolution trends, while the styles of defenders should be reconsidered. Multifunctional playing styles were also found in high-rated CSL players.
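
To illustrate the NMF step named in this abstract, here is a minimal sketch using the classic multiplicative-update rules for the squared-error objective. It is not the study's implementation: the `events` matrix (rows as player-matches, columns as pitch zones), the rank of 2, and all function names are assumptions made up for the example.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*a)]


def frobenius_error(v, w, h):
    """Squared Frobenius norm of V - WH."""
    wh = matmul(w, h)
    return sum((v[i][j] - wh[i][j]) ** 2
               for i in range(len(v)) for j in range(len(v[0])))


def nmf(v, rank, iterations=200, seed=0, eps=1e-9):
    """Factor a nonnegative n x m matrix V into W (n x rank) and H (rank x m)."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    # Strictly positive random start so the multiplicative updates can move.
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(rank)]
    for _ in range(iterations):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num = matmul(wt, v)
        den = matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num = matmul(v, ht)
        den = matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Hypothetical event counts: 4 player-matches (rows) x 4 pitch zones (columns).
events = [
    [5, 4, 0, 0],
    [4, 5, 1, 0],
    [0, 1, 6, 5],
    [0, 0, 5, 6],
]
w, h = nmf(events, rank=2)
```

In this toy layout, each row of `h` can be read as a spatial pattern over zones and each row of `w` as one player-match's nonnegative mix of those patterns, which is the general idea behind compressing event data into a short vector.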


Subjects
Athletic Performance; Soccer; Humans; China
4.
Front Psychol ; 13: 899199, 2022.
Article in English | MEDLINE | ID: mdl-35719541

ABSTRACT

Establishing and illustrating a predictive and prescriptive model of the playing styles that football teams adopt during matches is a key step toward describing and measuring the effectiveness of styles of play. The current study aimed to identify and measure the effectiveness of different defensive playing styles for professional football teams, considering the opponent's expected goals. Event data from all 1,120 matches played in the Chinese Football Super League (CSL) from the 2016 to 2020 seasons were collected, and fifteen defense-related performance variables were extracted. The PCA model (KMO = 0.76) output eight factors that represented seven different styles of play (factors 6 and 8 represent one style of play) and explained 85.17% of the total variance. An expected goal (xG) model was built using data from 27,852 shots. The xG of the opponent was then modeled with multivariate regression, which yielded five significant factors (p < 0.05) that explained 41.6% of the total variance in the xG of the opponent; receiving dangerous situations (factor 7) was the most prominent style (31.3%). Finally, with the 2020 season used as testing data, the xG predicted from defensive styles correlated moderately with the actual xG of the opponent (r = 0.62). The results indicated that if a team strengthened its defense close to its own goal, high-intensity confrontation, and goalkeeper defense, while making fewer errors and receiving fewer dangerous situations, the xG of the opponent would be greatly reduced.
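
The correlation between predicted and actual opponent xG reported above (r = 0.62) is a Pearson correlation coefficient. As a minimal sketch of how such a value is computed, the snippet below uses made-up per-match numbers; `predicted_xg`, `actual_xg`, and `pearson_r` are hypothetical names, and none of the values come from the study.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the sums of squared deviations.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sd_x * sd_y)


# Hypothetical opponent xG per match: model prediction vs. observed value.
predicted_xg = [0.8, 1.1, 1.5, 0.9, 2.0]
actual_xg = [1.0, 0.9, 1.7, 1.2, 1.8]
r = pearson_r(predicted_xg, actual_xg)
```

A value of `r` near 1 means the predictions rank and scale matches much like the observed xG does; values around 0.6, as in the abstract, are conventionally described as moderate.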

5.
IEEE Trans Cybern ; 52(7): 6809-6821, 2022 Jul.
Article in English | MEDLINE | ID: mdl-33301412

ABSTRACT

This article presents a new command-filtered composite adaptive neural control scheme for uncertain nonlinear systems. Compared with existing works, this approach focuses on achieving finite-time convergent composite adaptive control for higher-order nonlinear systems with unknown nonlinearities, parameter uncertainties, and external disturbances. First, radial basis function neural networks (NNs) are utilized to approximate the unknown functions of the considered uncertain nonlinear system. Prediction errors are constructed from serial-parallel nonsmooth estimation models, and the prediction errors and the tracking errors are fused to update the weights of the NNs. Afterward, the composite adaptive neural backstepping control scheme is proposed via nonsmooth command filter and adaptive disturbance estimation techniques. The proposed control scheme ensures that high-precision tracking performance and NN approximation performance can be achieved simultaneously. Meanwhile, it avoids the singularity problem in the finite-time backstepping framework. Moreover, it is proved that all signals in the closed-loop control system converge in finite time. Finally, simulation results are given to illustrate the effectiveness of the proposed control scheme.
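
As a rough illustration of how a radial basis function network can approximate an unknown function, the sketch below fits Gaussian RBF weights to samples of a stand-in target. It deliberately swaps in a plain least-mean-squares weight update rather than the article's composite adaptive law (which fuses prediction and tracking errors); the target `math.sin`, the centers, the width, and the learning rate are all assumptions chosen for the example.

```python
import math


def rbf_features(x, centers, width):
    """Gaussian basis functions evaluated at scalar input x."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def rbf_output(x, weights, centers, width):
    """Network output: weighted sum of the basis functions."""
    return sum(w * p for w, p in zip(weights, rbf_features(x, centers, width)))


def fit_weights(samples, centers, width, rate=0.1, epochs=500):
    """Fit the weights with a simple LMS rule (not the article's update law)."""
    weights = [0.0] * len(centers)
    for _ in range(epochs):
        for x, target in samples:
            phi = rbf_features(x, centers, width)
            error = target - sum(w * p for w, p in zip(weights, phi))
            # Move each weight along its basis function, scaled by the error.
            weights = [w + rate * error * p for w, p in zip(weights, phi)]
    return weights


# Assumed setup: 9 centers spread over [-2, 2], 17 samples of a stand-in function.
centers = [-2.0 + 0.5 * i for i in range(9)]
width = 0.5
samples = [(-2.0 + 0.25 * i, math.sin(-2.0 + 0.25 * i)) for i in range(17)]

weights = fit_weights(samples, centers, width)
max_error = max(abs(rbf_output(x, weights, centers, width) - y) for x, y in samples)
initial_error = max(abs(y) for _, y in samples)  # error of the all-zero network
```

Comparing `max_error` with `initial_error` shows how far the fitted network has moved from the all-zero starting point on the sampled inputs.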


Subjects
Neural Networks, Computer; Nonlinear Dynamics; Computer Simulation; Feedback
6.
Opt Express ; 29(16): 25142-25160, 2021 Aug 02.
Article in English | MEDLINE | ID: mdl-34614852

ABSTRACT

Millimeter-wave (MMW) imaging is becoming an important option in many sensing applications. However, the resulting images are often plagued with artifacts caused by complex target scenarios such as concave structures, hampering applications where precise recognition is emphasized. It has been shown that existing imaging techniques can effectively resolve this issue by considering the multi-reflection propagation process in the forward model of the inverse problem. However, the accuracy of such methods still depends on the precise separation of reflected signals exhibiting different numbers of interactions with the target surfaces. In this article, an improved imaging technique based on circular polarizations is proposed for accurate imaging of concave objects. By utilizing circularly polarized measurements, the received signal can be divided into components with odd and even numbers of reflections. Then, an iterative reconstruction technique is introduced to automatically separate signal components and reconstruct precise contours of the concave surfaces. Furthermore, a strict observation angle boundary model is derived based on the method of stationary phase to correct the image deformation of edges existing in previous algorithms. Both numerical and experimental results synthesized from 6-18 GHz dual-polarized measurements are used to demonstrate the improved accuracy and automation of the proposed method.

7.
IEEE Trans Neural Netw Learn Syst ; 32(6): 2358-2372, 2021 Jun.
Article in English | MEDLINE | ID: mdl-32673195

ABSTRACT

Generating collision-free, time-efficient paths in an uncertain dynamic environment poses huge challenges for the formation control with collision avoidance (FCCA) problem in a leader-follower structure. In particular, the followers have to take both formation maintenance and collision avoidance into account simultaneously. Unfortunately, most of the existing works are simple combinations of methods dealing with the two problems separately. In this article, a new method based on deep reinforcement learning (RL) is proposed to solve the problem of FCCA. In particular, the learning-based policy is extended to the field of formation control, which involves a two-stage training framework: an imitation learning (IL) stage followed by an RL stage. In the IL stage, a model-guided method consisting of a consensus theory-based formation controller and an optimal reciprocal collision avoidance strategy is designed to speed up training and increase efficiency. In the RL stage, a compound reward function is presented to guide the training. In addition, we design a formation-oriented network structure to perceive the environment. Long short-term memory is adopted to enable the network structure to perceive information about an uncertain number of obstacles, and a transfer training approach is adopted to improve the generalization of the network in different scenarios. Numerous representative simulations are conducted, and our method is further deployed to an experimental platform based on a multiomnidirectional-wheeled car system. The effectiveness and practicability of our proposed method are validated through both the simulation and experiment results.

8.
IEEE Trans Cybern ; 51(5): 2504-2517, 2021 May.
Article in English | MEDLINE | ID: mdl-31329154

ABSTRACT

This paper presents a novel robust adaptive tracking control method for a hypersonic vehicle in a cruise flight stage based on an interval type-2 fuzzy-logic system (IT2-FLS) and a small-gain approach. After the input-output linearization, the vehicle model can be decomposed into two uncertain subsystems by considering matching disturbances and parametric uncertainties. For each subsystem, an interval type-2 Takagi-Sugeno-Kang fuzzy logic system (IT2-TSK-FLS) is then employed to approximate the unavailable model information. Following the idea of a small-gain approach, a composite feedback form for each subsystem is constructed, based on which the final robust adaptive tracking control law is developed. Rigorous stability analysis shows that all signals in the derived closed-loop system are kept uniformly ultimately bounded (UUB). The main contribution of this paper is that the proposed control law for the hypersonic vehicle requires only two adaptive parameters in total, which can greatly alleviate the computation and storage burden in practice; meanwhile, its superiority over the conventional minimal-learning-parameter (MLP)-based one is specifically illustrated. Comparative numerical simulations of three cases demonstrate the effectiveness of our proposed control method with respect to complicated uncertainties.

9.
Braz J Cardiovasc Surg ; 33(4): 384-390, 2018.
Article in English | MEDLINE | ID: mdl-30184036

ABSTRACT

OBJECTIVE: This study aimed to investigate the protective effects of baicalin on myocardial infarction in rats and explore the related mechanisms. METHODS: Fifty Sprague Dawley rats were randomly divided into the control, model, and low-, medium- and high-dose baicalin groups. The latter 3 groups were intraperitoneally injected with baicalin at doses of 12.5, 25 and 50 mg/kg, respectively. Then, the myocardial infarction model was established. The hemodynamics of the rats were tested, the serum lactate dehydrogenase (LDH), creatine kinase-MB (CK-MB), prostacyclin (PGI2) and thromboxane A2 (TXA2) were determined, the myocardial superoxide dismutase (SOD) and malondialdehyde (MDA) levels were detected, and the myocardial B-cell lymphoma-2 (Bcl-2) and Bcl-2 associated X (Bax) protein expressions were determined. RESULTS: Compared with the model group, in the high-dose baicalin group the ST segment height and LVEDP were significantly decreased (P<0.05), the LVSP was significantly increased (P<0.05), the serum LDH, CK-MB and TXA2 levels were significantly decreased (P<0.05), the PGI2 level was significantly increased (P<0.05), the myocardial SOD level was significantly increased (P<0.05), and the myocardial MDA level was significantly decreased (P<0.05); the myocardial Bcl-2 protein level was significantly increased, and the Bax protein level was significantly decreased (P<0.05). CONCLUSION: Baicalin has protective effects on myocardial infarction in rats. The possible mechanisms may be related to its resistance to oxidative stress, up-regulation of Bcl-2 protein expression, and down-regulation of Bax protein expression in myocardial tissue.


Subjects
Flavonoids/pharmacology; Myocardial Infarction/prevention & control; Protective Agents/pharmacology; Animals; Chromatography, High Pressure Liquid; Creatine Kinase, MB Form/blood; Enzyme-Linked Immunosorbent Assay; Epoprostenol/blood; Genes, bcl-2; Hemodynamics/drug effects; L-Lactate Dehydrogenase/blood; Malondialdehyde/analysis; Random Allocation; Rats, Sprague-Dawley; Reference Values; Reproducibility of Results; Superoxide Dismutase/analysis; Thromboxane A2/blood; Treatment Outcome; bcl-2-Associated X Protein/analysis
10.
Rev. bras. cir. cardiovasc ; 33(4): 384-390, July-Aug. 2018. tab, graf
Article in English | LILACS | ID: biblio-958430

ABSTRACT

Objective: This study aimed to investigate the protective effects of baicalin on myocardial infarction in rats and explore the related mechanisms. Methods: Fifty Sprague Dawley rats were randomly divided into the control, model, and low-, medium- and high-dose baicalin groups. The latter 3 groups were intraperitoneally injected with baicalin at doses of 12.5, 25 and 50 mg/kg, respectively. Then, the myocardial infarction model was established. The hemodynamics of the rats were tested, the serum lactate dehydrogenase (LDH), creatine kinase-MB (CK-MB), prostacyclin (PGI2) and thromboxane A2 (TXA2) were determined, the myocardial superoxide dismutase (SOD) and malondialdehyde (MDA) levels were detected, and the myocardial B-cell lymphoma-2 (Bcl-2) and Bcl-2 associated X (Bax) protein expressions were determined. Results: Compared with the model group, in the high-dose baicalin group the ST segment height and LVEDP were significantly decreased (P<0.05), the LVSP was significantly increased (P<0.05), the serum LDH, CK-MB and TXA2 levels were significantly decreased (P<0.05), the PGI2 level was significantly increased (P<0.05), the myocardial SOD level was significantly increased (P<0.05), and the myocardial MDA level was significantly decreased (P<0.05); the myocardial Bcl-2 protein level was significantly increased, and the Bax protein level was significantly decreased (P<0.05). Conclusion: Baicalin has protective effects on myocardial infarction in rats. The possible mechanisms may be related to its resistance to oxidative stress, up-regulation of Bcl-2 protein expression, and down-regulation of Bax protein expression in myocardial tissue.


Subjects
Animals; Flavonoids/pharmacology; Protective Agents/pharmacology; Myocardial Infarction/prevention & control; Reference Values; Superoxide Dismutase/analysis; Thromboxane A2/blood; Enzyme-Linked Immunosorbent Assay; Random Allocation; Reproducibility of Results; Chromatography, High Pressure Liquid; Epoprostenol/blood; Treatment Outcome; Rats, Sprague-Dawley; Genes, bcl-2; Creatine Kinase, MB Form/blood; bcl-2-Associated X Protein/analysis; Hemodynamics/drug effects; L-Lactate Dehydrogenase/blood; Malondialdehyde/analysis